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ABSTRACT 

We  consider  methods  for  finding  high-precision  approxim- 
ations to  simple  zeros  of  smooth  functions.  As  an  application, 
we  give  fast  methods  for  evaluating  the  elementary  functions 
log(x),  exp(x),  sin(x)  etc.  to  high  precision.  For  example, 
if  x is  a positive  floating-point  number  with  an  n-bit  frac- 
tion, then  (under  rather  weak  assumptions)  an  n-bit  approxim- 
ation to  log(x)  or  exp(x)  may  be  computed  in  time  asymptot- 
ically equal  to  13M(n)log2n  as  n + “,  where  M(n)  is  the 
time  required  to  multiply  floating-point  numbers  with  n-bit 
fractions.  Similar  results  are  given  for  the  other  elementary 
functions,  and  some  analogies  with  operations  on  formal  power 
series  are  mentioned. 

1.  INTRODUCTION 

When  comparing  methods  for  solving  nonlinear  equations  or 
evaluating  functions,  it  is  customary  to  assume  that  the  basic 
arithmetic  operations  (addition,  multiplication,  etc.)  are 
performed  with  some  fixed  precision.  However,  an  irrational 
number  can  only  be  approximated  to  arbitrary  accuracy  if  the 
precision  is  allowed  to  increase  indefinitely.  Thus,  we  shall 
consider  iterative  processes  using  variable  precision.  Usually 
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the  precision  will  increase  as  the  computation  proceeds,  and 
the  final  result  will  be  obtained  to  high  precision.  Of 
course,  we  could  use  the  same  (high)  precision  throughout,  but 
then  the  computation  would  take  longer  than  with  variable  pre- 
cision, and  the  final  result  would  be  no  more  accurate. 

Assumptions 

For  simplicity  we  assume  that  a standard  multiple- 
precision  floating-point  number  representation  is  used,  with  a 
binary  fraction  of  n bits,  where  n is  large.  The  exponent 
length  is  fixed,  or  may  grow  as  o(n)  if  necessary.  To  avoid 
table-lookup  methods,  we  assune  a machine  with  a finite  random- 
access  memory  and  a fixed  number  of  sequential  tape  units. 
Formally,  the  results  hold  for  multitape  Turing  machines. 

Precision  n Operations 

An  operation  is  performed  with  precision  n if  the  operands 
and  result  are  floating-point  numbers  as  above  (i.e.,  precision 
n numbers),  and  the  relative  error  in  the  result  is  0(2”n)  . 

Precision  n Multiplication 

Let  M(n)  be  the  time  required  to  performe  precision  n 
multiplication.  (Time  may  be  regarded  as  the  number  of  single- 
precision operations,  or  the  number  of  bit  operations,  if 

2 

desired.)  The  classical  method  gives  M(n)  * 0(n  ) , but 
methods  which  are  faster  for  large  n are  known.  Asymptotic- 
ally the  fastest  method  known  is  that  of  Schonhage  and  Strassen 
[71] , which  gives 

(1.1)  M(n)  = O(n.log(n)loglog(n))  as  n 00 . 

Our  results  do  not  depend  on  the  algorithm  used  for 
multiplication,  provided  M(n)  satisfies  the  following  two 
conditions. 


3. 


(1.2)  n = o(M(n))  , i.e.,  lim  n/M(n)  = 0 ; 

n-*» 

and,  for  any  a > 0 , 

(1.3)  M(otn)  ~ aM(n)  , i.e.,  lim  - 1 . 

Condition  (1.2)  enables  us  to  neglect  additions,  since 
the  time  for  an  addition  is  0(n)  , which  is  asymptotically 
negligible  compared  to  the  time  for  a multiplication.  Condi- 
tion (1.3)  certainly  holds  if 

M(n)  ~ cn[log(n)]^[loglog(n)]Y  , 
though  it  does  not  hold  for  some  implementations  of  the 
Schonhage-Strassen  method.  We  need  (1.3)  to  estimate  the  con- 
stants in  the  asymptotic  "0"  results:  if  the  constants  are 

not  required  then  much  weaker  assumptions  suffice,  as  in  Brent 
[75a, b] . 

The  following  lemma  follows  easily  from  (1.3). 

Lemma  1.1 

If  0 < a < 1 , M(n)  = 0 for  n < 1 , and  c^  < < C2> 

then  00  . 

c.M(n)  < l M(a  n)  < c«M(n) 
k=0 

for  all  sufficiently  large  n . 

2.  BASIC  MULTIPLE-PRECISION  OPERATIONS 

In  this  section  we  summarize  some  results  on  the  time 
required  to  perform  the  multiple-precision  operations  of  div- 
ision, extraction  of  square  roots,  etc.  Additional  results  are 
given  in  Brent  [75a] . 

Reciprocals 

Suppose  a t 0 is  given  and  we  want  to  evaluate  a pre- 
cision n approximation  to  1/a  . Applying  Newton's  method  to 
the  equation 
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4. 


f(x)  = a - 1/x  ■ 0 
gives  the  well-known  iteration 


where 


xi+l  " xi  " xiei  * 


£i  * axi  “ 1 • 


Since  the  order  of  convergence  is  two,  only  k - log2n  iter- 
ations are  required  if  x^  is  a reasonable  approximation  to 
1/a  , e.g.,  a single-precision  approximation. 


If  e,  = 0(2"n)  , then  e. 


0(2  ' ) , so  at  the  last 


iteration  it  is  sufficient  to  perform  the  multiplication  of 
xk_1  by  Ek  l using  precision  n/2  , even  though  axk  1 must 
be  evaluated  with  precision  n . Thus,  the  time  required  for 
the  last  iteration  is  M(n)  + M(n/2)  + 0(n)  . The  time  for 
the  next  to  last  iteration  is  M(n/2)  + M(n/4)  + 0(n/2)  , since 
this  iteration  need  only  give  an  approximation  accurate  to 
0(2~n/2)  ^ and  soon  Thus,  using  Lemma  1.1,  the  total  time 
required  is 

I(n)  ~ (1  + ?)(1  + j + j ♦ ...)M(n)  - 3M(n) 
as  n -*■  « . 

Division 


Since  b/a  = b(l/a)  , precision  n division  may  be  done  in 


time 


as  n » . 


D(n)  ~ 4M(n) 


Inverse  Square  Roots 


Asymptotically  the  fastest  known  method  for  evaluating  a”** 
to  precision  n is  to  use  the  third-order  iteration 


xi+l 


3 _2, 


5. 


where 

n 

£ . = ax?  - 1 . 

1 l 

2 

At  the  last  iteration  it  is  sufficient  to  evaluate  axt  to 

2 T 9 

precision  n , to  precision  n/3  , and  x^(e^  - ~ e?)  to 
precision  2n/3  . Thus,  using  Lemma  1.1  as  above,  the  total 
time  required  is 

Q(n)  ~ (2  ♦ j + f)(l  + j + -■  + ...)M(n)  - 4^M(n) 
as  n *+  00  . 

Square  Roots 
Since 

a. a 2 if  a > 0 , 

0 if  a = 0 , 
we  can  evaluate  a to  precision  n in  time 


R(n)  ~ 5^M(n) 

as  n + ®.  Note  that  the  direct  use  of  Newton's  method  in  the 
form 


(2.1) 


xi*l  = 2(xi  * a/xi> 


or 

(2.2) 


x.  , = x.  + 
l+l  l 


t 2^ 

a - xt 


2x. 


is  asymptotically  slower,  requiring  time  ~ 8M(n)  or  ~ 6M(n) 
respectively. 


3.  VARIABLE-PRECISION  ZERO-FINDING  METHODS 


Suppose  is  a simple  zero  of  the  nonlinear  equation 

f(x)  =0  . 

Here,  f(x)  is  a sufficiently  smooth  function  which  can  be 
evaluated  near  C , with  absolute  error  0(2~n)  , in  time  w(n). 
We  consider  some  methods  for  evaluating  £ to  precision  n . 


Ml  !npWln|iiWf|^pi|»illi,<ju.y 
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6. 


Since  we  are  interested  in  results  for  very  large  n , the 

1 i 

tine  required  to  obtain  a good  starting  approximation  is 

neglected.  , l 

Assumptions 

To  obtain  shatp  results  we  need  the  following  two  assump- 
tions, which  are  similar  to  (1.2)  and  (1.3):  ; 

(3.1)  M(n)  ■ o(w(n))  , i.e.,  lim  M(n)/w(n)  * 0 ; 

n-*»  j 

and,  for  some  a >,  1 and  all  6 > 0 , 

(3.2)  w(8n)  - Baw(n)  1 

as  n ». 

From  (3.1),  the  time  required  for  a multiplication  is 
negligible  compared  to  the  time  for  a function  evaluation,  if 
n is  sufficiently  large.  (3.2)  implies  (3.1)  if  a > 1 , and  , 

(3.2)  certainly  holds  if,  for  example, 

w(n)  - cnallog(n)]Y[loglog(n)]^  . 

The  next  lemma  follows  from  our  assumptions  in  much  the 
same  way  as  Lemma  1.1. 

Lemma  3.1  j 

t 1 

If  0 < 8 < 1 , w(n)  = 0 for  n < 1 , and 

Cj  < 1/(1  - B°)  < c2  , j j 

then  _ I 

k i 1 

c.w(n)  < l w(8  n)  < c9w(n)  j 

k-°  I 

for  all  sufficiently  large  n . 3 

A Discrete  Newton  Method  I 

j 

To  illustrate  the  ideas  of  variable-precision  zero-finding 
methods,  we  describe  a simple  discrete  Newton  method.  More 

efficient  methods  are  described  in  the  next  three  sections,  and  j 





in  Brent  [75a] . 


Consider  the  iteration 

x.  . = x.  - f(x.)/g.  , 
i+I  1 i fei 

where  is  a one-sided  difference  approximation  to  f'(x^), 

i,e*’  f (x . + h.)  - f(x.) 

i r 

8i  h. 

i 

If  e.  = |x.  - cj  is  sufficiently  small,  f(x.)  is  evaluated 
with  absolute  error  0(ef)  , and  h^  is  small  enough  that 

(3.3)  g.  = f'(Xi)  + 0(e.)  , 

then  the  iteration  converges  to  C with  order  at  least  two. 

To  ensure  (3.3),  take  h^  of  order  , e.g.  h^  = f(x^)  . 

To  obtain  C to  precision  n,  we  need  two  evaluations  of 
f with  absolute  error  0(2~n)  , preceded  by  two  evaluations 
with  error  0(2  n^2)  , etc.  Thus,  the  time  required  is 

(3.4)  t(n)  ~ 2(1  + 2'a  + 2"2a  + ...)w(n)  . 

Asymptotic  Constants 

We  say  that  a zero -finding  method  has  asymptotic  constant 
C(a)  if,  to  find  a simple  zero  £ f 0 to  precision  n,  the 
method  requires  time  t(n)  ~ C(a)w(n)  as  n ■+  °°.  (The  asymp- 
totic constant  as  defined  here  should  not  be  confused  with  the 
"asymptotic  error  constant"  as  usually  defined  for  single- 
precision zero-finding  methods.) 

For  example,  from  (3.4),  the  discrete  Newton  method  des- 
cribed above  has  asymptotic  constant 

CN(oO  = 2/(1  - 2'“)  < 4 . 

Note  that  the  time  required  to  evaluate  £ to  precision  n is 
only  a small  multiple  of  the  time  required  to  evaluate  f(x) 
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with  absolute  error  0(2"n)  . (If  we  used  fixed  precision,  the 
time  to  evaluate  £ would  be  0(log(n))  times  the  time  to 
evaluate  f(x).) 

4.  A VARIABLE-PRECISION  SECANT  METHOD 

The  secant  method  is  known  to  be  more  efficient  than  the 
discrete  Newton  method  when  fixed-precision  arithmetic  is  used. 
The  same  is  true  with  variable-precision  arithmetic,  although 
the  ratio  of  efficiencies  is  no  longer  constant,  but  depends 
on  the  exponent  a in  (3.2).  Several  secant-like  methods  are 
described  in  Brent  [75a] ; here  we  consider  the  simplest  such 
method,  which  is  also  the  most  efficient  if  a < 4.5243...  . 

The  secant  iteration  is 

x.  - x.  , 

x = x . f _i III 

l+l  i l f7  - fi  ]L  * 

* • 

where  ^ = ffx^  , and  we  assume  that  the  function  evaluations 

are  performed  with  sufficient  accuracy  to  ensure  that  the  order 

1 h 

of  convergence  is  at  least  p = y(l  + 5 ) = 1.6180...  , the 
larger  root  of 

(4.1)  p2  = p + 1 . 

Let  e = |x^  j - c|  . Since  the  smaller  root  of  (4.1) 
lies  inside  the  unit  circle,  we  have 

x.  - C = 0(eP) 
and  2 

xi+1  - C = 0(ep  ) . 

To  give  order  p , f.  must  be  evaluated  with  absolute  error 
2 1 D 
0(ep  ) . Since  f^  = 0(|x^  - s|)  = 0(£M)  , it  is  also  necess- 
ary to  evaluate  (f.  - f.  .)/(x.  - x.  ..)  with  relative  error 
2 i l j.  x i“i 

0(eP  " P)  , but  lxi  " xi  j!  - z * so  it  is  necessary  to 
evaluate  f^  ^ with  absolute  error  0(ep  ”P  + 1)  . [Since 


f.  must  be  evaluated  with  absolute  error  0(ep  ) , f.  1 must 
be  evaluated  with  absolute  error  0(ep)  , but  p - p + 1 * 2 > p , 
so  this  condition  is  superfluous.] 

The  conditions  mentioned  are  sufficient  to  ensure  that 

the  order  of  convergence  is  at  least  p . Thus,  if  we  replace 

eP  by  2"n  , we  see  that  C may  be  evaluated  to  precision  n 

if  f is  evaluated  with  absolute  errors  0(2"n)  , 0(2~^nP  ), 

— 2no" ^ - 2no_4 

0(2  y ),  0(2  w ) It  follows  that  the  asymptotic 

constant  of  the  secant  method  is 

Cg(ot)  * 1 . (2p'2)°7(l  - p*“)  $ Cg Cl ) = 3 . 

The  following  lemma  states  that  the  secant  method  is 
asymptotically  more  efficient  than  the  discrete  Newton  method 
when  variable  precision  is  used. 

Lemma  4.1 

Cs(oO  < CN(a)  for  all  ail.  In  fact,  Cg(a)/CN(a) 
decreases  monotonically  from  (when  a = 1)  to  j (as 
a ■+■  °°) . 

5.  OTHER  VARIABLE-PRECISION  INTERPOLATORY  METHODS 

With  fixed  precision,  inverse  quadratic  interpolation  is 
more  efficient  than  linear  interpolation,  and  inverse  cubic 
interpolation  is  even  more  efficient,  if  the  combinatory  cost 
(i.e.,  "overhead")  is  negligible.  With  variable  precision  the 
situation  is  different.  Inverse  quadratic  interpolation  is 
slightly  more  efficient  than  the  secant  method,  but  inverse 
cubic  interpolation  is  not  more  efficient  than  inverse  quad- 
ratic interpolation  if  a £ 4.6056...  . Since  the  combinatory 
cost  of  inverse  cubic  interpolation  is  considerably  higher 
than  that  of  inverse  quadratic  interpolation,  the  inverse 

cubic  method  appears  even  worse  if  combinatory  costs  are  sig- 
nificant. 


10. 
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Inverse  Quadratic  Interpolation 

The  analysis  of  variable-precision  methods  using  inverse 

quadratic  interpolation  is  similar  to  that  for  the  secant 

method,  so  we  only  state  the  results.  The  order  p = 1.8392... 

3 2 

is  the  positive  root  of  p = p + p + 1 . It  is  convenient 
to  define  0 = 1/p  = 0.5436...  To  evaluate  £ to  precision 
n requires  evaluations  of  f to  (absolute)  precision  n, 

(1  - a + a2)n  , and  0^(1  - o - a2  + 2oJ)n  for  j=0,l,2,... 
Thus,  the  asymptotic  constant  is 

CQ(a)  ■ 1 ♦ (1  - o + a2)u  + (3a3)a/(l  - oa) 

* CQ(1)  = j(7  - 2a  - a2)  = 2.8085...  . 

Lemma  5.1 

Cq(oi)  < Cg(o0  for  all  a 5 1 . In  fact,  Cq(oO/Cs(cO 
increases  monotonically  from  0.9361...  (when  a = 1)  to 
1 (as  a ») . 

Inverse  Cubic  Interpolation,  etc. 

If  y = 0.5187...  is  the  positive  root  of 
4 3 2 

y + y + y + y = 1 , then  the  variable-precision  method  of 
order  1/y  = 1.9275...  , using  inverse  cubic  interpolation, 
has  asymptotic  constant 

cc(a)  = 1 + (1  - y + y2)a  + (1  - y - y2  + 2y3)a 

+ (4p4)a/(l  - ya) 

$ Cc(l)  = (13  - 6y  - 4y2  - 2y3)/3  = 2.8438...  . 

Note  that  C^(l)  > Cp(l)  . Variable-precision  methods 
using  inverse  interpolation  of  arbitrary  degree  are  described 
in  Brent  [75a],  Some  of  these  methods  are  slightly  more 
efficient  than  the  inverse  quadratic  interpolation  method  if 
a is  large,  but  inverse  quadratic  interpolation  is  the  most 
efficient  method  known  for  a < 4.6056...  . In  practice  a 
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is  usually  1,  1%  or  2. 

An  Open  Question 

Is  there  a method  with  asymptotic  constant  C(a)  such 
that  C(l)  < CQ(1)  ? 

6.  VARIABLE-PRECISION  METHODS  USING  DERIVATIVES 

In  Sections  3 to  5 we  considered  methods  for  solving  the 
nonlinear  equation  f(x)  = 0 , using  only  evaluations  of  f . 
Sometimes  it  is  easy  to  evaluate  f*(x)  , fH(x),  ...  once 
f(x)  has  been  evaluated,  and  the  following  theorem  shows  that 
it  is  possible  to  take  advantage  of  this.  For  an  application, 
see  Section  10. 

Theorem  6.1 

If  the  time  to  evaluate  f(x)  with  an  absolute  error 
0(2  n)  is  w(n)  , where  w(n)  satisfies  conditions  (3.1)  and 
(3.2),  and  (for  k=l,2,...)  the  time  to  evaluate  f^(x)  with 
absolute  error  0(2~n)  is  wk(n)  , where 

wk(n)  = o(w(n)) 

as  n -*■  ®,  then  the  time  to  evaluate  a simple  zero  £ 0 of 

f(x)  to  precision  n is 

t(n)  - w(n) 

as  n -*  °°. 

Proof 

For  fixed  k 2 1 , we  may  use  a direct  or  inverse  Taylor 
series  method  of  order  k ♦ 1 . The  combinatory  cost  is  of 
order  k.log(k  + l).M(n)  (see  Brent  and  Kung  [75]).  From 
(3.1),  this  is  o(w(n))  as  n -*■  00  . Thus, 

t(n)  f [1  - (k  ♦ w(n)  ♦ o(w(n)) 

J (1  ♦ o(l))w(n)  . 
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For  sufficiently  large  n , the  Ho(l)"  term  is  less  than  1/k, 

so  2 

t(n)  < (1  + ^-jw(n)  . 

Given  e > 0 , choose  k $ 2/e  . Then,  for  all  sufficiently 
large  n , 

w(n)  $ t (n)  $ (1  + e)w(n)  , 
so  t(n)  ~ w(n)  as  n 00  . 


Corollary  6.1 


If  the  conditions  of  Theorem  6.1  hold,  f:[a,b]  -►  I , 
f'(x)  t 0 for  x € [a,b]  , and  g is  the  inverse  function  of 
f , then  the  time  to  evaluate  g(y)  with  absolute  error 
0(2  n)  , for  y € I , is 

wg(n)  ~ w(n) 

as  n -+  00 . 


Note 

Corollary  6.1  is  optimal  in  the  sense  that,  if 

w (n)  ~ cw(n)  for  some  constant  c < 1 , then  w(n)  - cw  (n) 

8 2 8 

by  the  same  argument,  so  w(n)  - c w(n)  , a contradiction. 

Hence,  c = 1 is  minimal. 

7.  THE  ARITHMETIC-GEOMETRIC  MEAN  ITERATION 

Before  considering  the  multiple-precision  evaluation  of 
elementary  functions,  we  recall  some  properties  of  the  arith- 
metic-geometric (A-G)  mean  iteration  of  Gauss  [1876].  Starting 
from  any  two  positive  numbers  a^  and  b^  , we  may  iterate  as 
follows: 

a^+j  = y(a^  + b^)  (arithmetic  mean) 
and 

b^+j  = (ajb.)'2  (geometric  mean) 


for  i=0,l,. 


• • 


13. 


Second-order  Convergence 

The  A-G  mean  iteration  is  of  computational  interest 
because  it  converges  very  fast.  If  « a^  , then 

2tbi/a4)’i  . 

bi.l/ai.l  ‘ 1 + bi/ai  ‘ 2(Vai>  ' 

so  only  about  | log2 (aQ/b0) | iterations  are  required  before 
a^/b^  is  of  order  1 . Once  and  b^  are  close  together 

the  convergence  is  second  order,  for  if  b^/a^  * 1 - then 

ei+l  - l-b.+1/ai4l  - 1-2(1-c.)!s/(2-£.)  - e2/8  + 0(e?)  . 

Limit  of  the  A-G  Mean  Iteration 

There  is  no  essential  loss  of  generality  in  assuming  that 
aQ  * 1 and  bQ  * cos<J>  for  some  $ • If  a = lim  a.  = lim  b^ , 
then 

(7.1)  a 


i-*oo  1 i-+oo  1 


TT 


2m) 


where  K(<J>)  is  the  complete  elliptic  integral  of  the  first 
kind,  i.e.. 


tt/2 


m)  - ! (1  - sin2$sin2e)*^de  . 

0 

(A  simple  proof  of  (7.1)  is  given  in  Melzak  [73].) 

Also,  if  Cq  = sin<f>  , c^+J  = a^  - a^+j  (i*0,l,...),  then 

(7-2)  l 2i’1c 2 • 1 - Ittl-  . 

i=0  1 K(<^ 

where  E(<|))  is  the  complete  elliptic  integral  of  the  second 
kind,  i.e., 

ir/2 


2 1 • 2 
iin  i)sin  0' 

Both  (7.1)  and  (7.2)  were  known  by  Gauss. 


E(<J>)  = / (1  - sin^sin^0)A16  . 

0 


1 
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r 
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Legendre’s  Identity 

For  future  use,  we  note  the  identity 

(7.3)  K(4>)F.(<n  + K(4>’)E(4>)  - K(d>)KC4>*)  = y*  , 

where  <j)  + <J>*  = jTT  . (Legendre  [11]  proved  by  differentiation 
that  the  left  side  of  (7.3)  is  constant,  and  the  constant  may 
be  determined  by  letting  <f  + 0.) 

8.  FAST  MULTIPLE-PRECISION  EVALUATION  OF  7T 

The  classical  methods  for  evaluating  tt  to  precision  n 
2 

take  time  0(n  ):  see,  for  example.  Shanks  and  Wrench  [62], 

2 

Several  methods  which  are  asymptotically  faster  than  0(n  ) 

are  known.  For  example,  in  Brent  [75a]  a method  which  requires 
2 

time  0(M(n)log  (n))  is  described.  From  the  bound  (1.1)  on 

1+E 

M(n)  , this  is  faster  than  0(n  ) for  any  e > 0 . 

Asymptotically  the  fastest  known  methods  require  time 
0(M(n)log(n))  . One  such  method  is  sketched  in  Beeler  et  al 
[72].  The  method  given  here  is  faster,  and  does  not  require 
the  preliminary  computation  of  e . 

The  Gauss-Legendre  method 

Taking  <p  * 4> ’ = 7T/4  in  (7.3),  and  dividing  both  sides 
2 

by  it  , we  obtain 

(8.1)  [2K(tt/ 4)E(tt/ 4)  - K2(w/4)]/tt2  = -L  . 

_u 

However,  from  the  A-G  mean  iteration  with  a^  = 1 and  b^  = 2 , 

and  the  relations  (7.1)  and  (7.2),  we  can  evaluate  K(tt/4)/tt 
and  E(ir/4)/Tr  , and  thus  the  left  side  of  (8.1).  A division 
then  gives  tt  . (The  idea  of  using  (7.3)  in  this  way  occurred 
independently  to  Salamin  [75]  and  Brent  [75b].)  After  a little 
simplification,  we  obtain  the  following  algorithm  (written  in 
pseudo-Algol) : 


i 

i 

! 

\ 

i 

t 

j 


L 


A «-  1;  B 2"**;  T «-  1/4;  X 1; 
while  A - B > 2"n  do 

begin  Y <«-  A;  A y(A  + B) ; B *■  (BY)55; 

T «-  T - X(A  - Y)2; 

, X <-  2X 
end; 

return  A2/T  (or,  better,  (A  + B)2/(4T)]  . 

The  rate  of  convergence  is  illustrated  in  Table  8.1. 


Table  8.1: 

Convergence  of  the 

Gauss -Legendre  Method 

Iteration 

A2/T  - TT 

* - (A  + B)2/(4T) 

0 

8 . 6 ' - 1 

2.3'-l 

1 

4.6'-2 

to 

i 

o 

iH 

2 

8.8' -5 

7.4 ' -9 

3 

o 

1—1 

1 

• 

to 

o 

1 

•» 

00 

H 

4 

3.7'-21 

5.5' -41 

5 

5.5' -43 

2.4' -84 

6 

1.2 '-86 

2. 3' -171 

7 

5.8'-174 

1.1' -345 

8 

1.3'-348 

1.1' -694 

9 

6.9'-698 

6.1 '-1393 

Since  the  A-G  mean  iteration  converges  with  order  2,  we 
need  ~log2n  iterations  to  obtain  precision  n.  Each  iteration 
involves  one  (precision  n)  square  root,  one  multiplication, 
one  squaring,  one  multiplication  by  a power  of  two,  and  some 
additions.  Thus,  from  the  results  of  Section  2,  the  time 
required  to  evaluate  ir  is  ~ -y-M(n)log2n  . 

Comments 


1.  Unlike  Newton's  iteration,  the  A-G  mean  iteration  is  not 
self-correcting.  Thus,  we  cannot  start  with  low  precision 


16. 


I and  increase  it,  as  was  possible  in  Section  2. 

' 

2.  Since  there  are  ~log2n  iterations,  we  may  lose 
0(loglog(n))  bits  of  accuracy  through  accumulation  of  round- 

t 

ing  errors,  even  though  the  algorithm  is  numerically  stable. 
Thus,  it  may  be  necessary  to  work  with  precision  n + 

|0(loglog(n))  . From  (1.3),  the  time  required  is  still 
-~M(n)log2n  . 

9.  MULTIPLE-PRECISION  EVALUATION  OF  LOG(X) 

’ | There  are  several  algorithms  for  evaluating  log(x)  to 

| j precision  n in  time  0(M(n)log(n))  . For  example,  a method 

S‘  based  on  Landen  transformations  of  incomplete  elliptic 

integrals  is  described  in  Brent  [75b].  The  method  described 
here  is  essentially  due  to  Salamin  (see  Beeler  et  al  [72]), 
though  the  basic  relation  (9.1)  was  known  by  Gauss. 

j.  U 

■ ’ If  cos(4>)  = zl 2  is  small,  then 

l I (9.1)  K(<f>)  = (1  + 0(e))  log  (4e~h) 

i : _v 

Thus,  taking  aQ  = 1 , bQ  = 4/y  , where  y = 4e  2 , and 

S applying  the  A-G  mean  iteration  to  compute  a = lim  a^  , gives 

log(y)  = £ (l  ♦ 0(y-2)) 

[ for  large  y . Thus,  so  long  as  y >,  2 , we  can  evaluate 

\ log(y)  to  precision  n.  If  log(y)  = 0(n)  then  ~21og2n 

iterations  are  required,  so  the  time  is  ~13M(n)log2n  , 
assuming  ir  is  precomputed. 

I For  example,  to  find  log (10^)  we  start  the  A-G  mean 

ji,.  iteration  with  a^  - 1 and  b^  = 4 '-6  . Results  of  the  first 

seven  iterations  are  given  to  10  significant  figures  in  Table 

9.1.  We  find  that  Tr/(2a_)  = 13.81551056,  which  is  correct. 

; • 


1 


ji 


i \ 


il 


j 

5 

t 

\ 


n 


Table  9.1:  Computation 

of  log(106) 

i 

a. 

l 

b. 

l 

0 

1. 000000000 '0 

4. 000000000' -6 

1 

5.000020000'-! 

2. 000000000' -3 

2 

2.510010000' -1 

3. 162283985'-2 

3 

1.413119199'-1 

8 . 909188753' -2 

4 

1 . 152019037 ' -1 

1 . 122040359* -1 

5 

1 . 137029698 ' -1 

1.136930893'-1 

6 

1. 136980295' -1 

1. 136980294 '-1 

7 

1.136980295'-! 

1.136980295'-! 

Since 

log (2)  = — log(2n)  , we  can  evaluate  log (2) 

precision  n 

in  time  ~13M(n)log2n  . 

Suppose  x € [b,c] 

where  b > 

1 . We  may  set  y » 2 x 

, evaluate  log (y) 

above,  and  use  the  identity 

log (x)  = log(y)  - n. log(2) 

to  evaluate  log(x)  . Since  log(y)  =<  n.log(2)  , approximately 
log2n  significant  bits  will  be  lost  through  cancellation,  so 
it  is  necessary  to  work  with  precision  n + 0(log(n)). 

If  x is  very  close  to  1 , we  have  to  be  careful  in 
order  to  obtain  log(x)  with  a small  relative  error.  Suppose 
x = 1 + 6 . If  |6|  < 2”n/lo^n^  we  may  use  the  power  series 

log(l  + 6)  = 6 - 62/2  + 63/3  - , 

and  it  is  sufficient  to  take  about  log(n)  terms.  If  6 is 
larger,  we  may  use  the  above  A-G  mean  method,  with  working 
precision  n + 0(n/log(n))  to  compensate  for  any  cancellation. 

Finally,  if  0 < x < 1 , we  may  use  log(x)  = -log(l/x)  , 
where  log(l/x)  is  computed  as  above.  To  summarize,  we  have 
proved : 


Theorem  9.1 

If  x > 0 is  a precision  n number,  then  log(x)  may  be 
evaluated  to  precision  n in  time  ~13M(n)log2n  as  n -*■  » 
[assuming  tt  and  log (2)  precomputed  to  precision  n + 
0(n/log(n))] . 

Note:  The  time  required  to  compute  log(x)  by  the  obvious 

power  series  method  is  0(nM(n))  . Since  131og,tn  < n for 

n £ F3  , the  method  described  here  may  be  useful  for  moderate 

2 

n , even  if  the  classical  0(n  ) multiplication  algorithm  is 
used. 

10.  MULTIPLE-PRECISION  EVALUATION  OF  EXP(X) 

Corresponding  to  Theorem  9.1,  we  have: 

Theorem  10.1 

If  [a,b]  is  a fixed  interval,  and  x € [a,b]  is  a 
precision  n number  such  that  exp(x)  does  not  underflow  or 
overflow,  then  exp(x)  can  be  evaluated  to  precision  n in 
time  ~13M(n)log2n  as  n -*■  00  (assuming  ir  and  log (2)  are 
precomputed) . 

Proof 

To  evaluate  exp(x)  we  need  to  solve  the  equation 
f(y)  = 0 , where  f(y)  = log(y)  - x , and  x is  regarded  as 
constant.  Since 

f(k)(y)  . (-I)*-1  (k  - l)!y'k 

can  be  evaluated  in  time  0(M(n))  = o(M(n)log(n))  for  any 
fixed  k £ 1 , the  result  follows  from  Theorems  6.1  and  9.1. 
[The  (k  + l)-th  order  method  in  the  proof  of  Theorem  6.1  may 
simply  be  taken  as 

k 

yi+i = yi  E (x " 1°g(yi));,/j!  J 

11  j=0  1 


11.  MULTIPLE-PRECISION  OPERATIONS  ON  COMPLEX  NUMBERS 


Before  considering  the  multiple-precision  evaluation  of 
trigonometric  functions,  we  need  to  state  some  results  on 
multiple-precision  operations  with  complex  numbers.  We  assume 
that  a precision  n complex  number  z = x + iy  is  represented 
as  a pair  (x,  y)  of  precision  n real  numbers.  As  before,  a 
precision  n operation  is  one  which  gives  a result  with  a 
relative  error  0(2  n)  . (Now,  of  course,  the  relative  error 
may  be  complex,  but  its  absolute  value  must  be  0(2~n).)  Note 
that  the  smaller  component  of  a complex  result  may  occasionally 
have  a large  relative  error,  or  even  the  wrong  sign! 

Complex  Multiplication 

Since  z = (t  + iu)  (v  + iw)  = (tv  - uw)  + i(tw  + uv)  , a 
complex  multiplication  may  be  done  with  four  real  multiplic- 
ations and  two  additions.  However,  we  may  use  an  idea  of 
Karatsuba  and  Ofman  [62]  to  reduce  the  work  required  to  three 
real  multiplications  and  some  additions:  evaluate  tv  , uw  , 

and  (t  + u)  (v  + w)  , then  use 

tw  + uv  = (t  + u) (v  + w)  - (tv  + uw)  . 

Since  jt  + u|  < 2^|t  + iu|  and  |v  + w|  $ 2**^  + iw(,  we 
have 

l(t  + u)(v  + w)j  $ 2 1 z | . 

Thus,  all  rounding  errors  are  of  order  2"n|z|  or  less,  and 
the  computed  product  has  a relative  error  0(2~n)  . The  time 
for  the  six  additions  is  asymptotically  negligible  compared  to 
that  for  the  three  multiplications,  so  precision  n complex 
multiplication  may  be  performed  in  time  ~3M(n)  . 

Complex  Squares 

2 

Since  (v  + iw)  = (v  - w) (v  + w)  + 2ivw  , a complex 


square  may  be  evaluated  with  two  real  multiplications  and 
additions,  in  time  ~2M(n)  . 

Complex  Division 

Using  complex  multiplication  as  above,  and  the  same  div- 
ision algorithm  as  in  the  real  case,  we  can  perform  complex 
division  in  time  ~12M(n)  . However,  it  is  faster  to  use  the 
identity 

t + iu  , 2 2.  -1 , . . . , . . , 

v + iw  * (v  + w ) [(t  + iu)(v  - iw)]  , 

reducing  the  problem  to  one  complex  multiplication,  four  real 
multiplications,  one  real  reciprocal,  and  some  additions. 

This  gives  time  -10M(n)  . For  complex  reciprocals  we  have 
t = 1 , u = 0 , and  time  ~7M(n)  . 

Complex  Square  Roots 

Using  (2.2)  requires,  at  the  last  iteration,  one  precision 
n complex  squaring  and  one  precision  n/2  complex  division.  Thus 
the  time  required  is  ~2(2  + 10/2)M(n)  = 14M(n)  . 

Complex  A-G  Mean  Iteration 

From  the  above  results,  a complex  square  root  and  multip- 
lication may  be  performed  in  time  ~17M(n)  . Each  iteration 
transforms  two  points  in  the  complex  plane  into  two  new  points, 
and  has  an  interesting  geometric  interpretation. 

12.  MULTIPLE-PRECISION  EVALUATION  OF  TRIGONOMETRIC  FUNCTIONS 

Since 

(12.1)  log(v  + iw)  = log|v  + Iw  | + i.artan(w/v) 
and 

(12.2)  exp(i6)  = cos(0)  + i.sin(0)  , 

we  can  evaluate  artan(x)  , cos(x)  and  sin(x)  if  we  can 
evaluate  log(z)  and  exp(z)  for  complex  arguments  z . This 


i*  f'li. 


i. 


ijirr  irmirii  i niUnm'.iikiti 


may  be  done  just  as  described  above  for  real  z , provided  we 

choose  the  correct  value  of  (a^bj)  . Some  care  is  necessary 

to  avoid  excessive  cancellation;  for  example,  we  should  use  the 
power  series  for  sin(x)  if  |x|  is  very  small,  as  described 

above  for  log(l  + 6)  . Since  '•21og2n  A-G  mean  iterations 

are  required  to  evaluate  log(z)  , and  each  iteration  requires 
time  ~17M(n)  , we  can  evaluate  log(z)  in  time  ~34M(n)log2n  . 
From  the  complex  version  of  Theorem  6.1,  exp(z)  may  also  be 
evaluated  in  time  ~34M(n)log2n  . 

As  an  example,  consider  the  evaluation  of  log(z)  for 
z = 106(2  + i)  . The  A-G  mean  iteration  is  started  with 
a^  = 1 and  b^  = 4/z  = 1.6' -6  - (8.0'-7)i  . The  results  of 
six  iterations  are  given,  to  8 significant  figures,  in  Table 
12.1. 


Table  12.1:  Evaluation  of 

log  106(2  + i). 

j 

a. 

J 

b. 

0 

(1. 0000000' 0, 

0. 0000000' 0) 

(1.6000000' -6, 

-8. 0000000 '-7) 

1 

(5. 0000080 '-1, 

-4. 0000000 '-7) 

(1.3017017'-3, 

-3.0729008 '-4) 

2 

(2.5065125* -1, 

-1.5384504* -4) 

(2. 5686505 '-2, 

-2.9907884 '-3) 

3 

(1 . 3816888 • -1 , 

-1.5723167 '-3) 

(8. 0373334 '-2, 

-4.6881008'-3) 

4 

(1.0927111 ' -1 , 

-3. 1302088' -3) 

(1. 0540970' -1, 

-3.6719673' -3) 

5 

(1 . 0734040' -1 , 

-3.4010880’ -3) 

(1 .0732355 ' -1 , 

-3.4064951 ’-3) 

6 

(1 .0733198 ' -1 , 

-3,4037916' -3) 

(1 . 0733198 ' -1 , 

-3.4037918 ' -3) 

We  find  that  = 14.620230  + 0.46364761i 

7 i 

- log | z | + i.artan(j) 


as  expected. 

Another  method  for  evaluating  trigonometric  functions  in 
time  0(M(n)log(n))  , without  using  the  identities  (12.1)  and 
(12.2),  is  described  in  Brent  [75b]. 

13.  OPERATIONS  ON  FORMAL  POWER  SERIES 


There  is  an  obvious  similarity  between  a multiple- 
precision  number  with  base  fL  : 

3e  l a.3"X  (0  < a < 3)  , 
i=l  A 

and  a formal  power  series: 

00 

£ a.x1  (a.  real,  x an  indeterminate)  . 
i=0  1 

Thus,  it  is  not  surprising  that  algorithms  similar  to  those 
described  in  Section  2 may  be  used  to  perform  operations  on 
power  series. 


In  this  section  only,  M(n)  denotes  the  number  of  scalar 
operations  required  to  evaluate  the  first  n coefficients 
Cq,  . . . ,c  j in  the  formal  product 


00 

I 

i=0 


uu 

I c.x1  . 
i=0  1 


Clearly, 

fact 


Cj  depends  only  on  sl^,  . . . and  b^, 

j 

c.  = J a.b.  . . 

J iio  1 


2 

The  classical  algorithm  gives  M(n)  = 0(n  ) , but  it  is  poss- 
ible to  use  the  fast  Fourier  transform  (FFT)  to  obtain 


M(n)  = Ofn.log (n))  . 

(see  Borodin  [73]). 


If  we  assume  that  M(n)  satisfies  conditions  (1.2)  and 


(1.3),  then  the  time  bounds  given  in  Section  2 for  division, 
square  roots,  etc.  of  multiple-precision  numbers  also  apply  for 
the  corresponding  operations  on  power  series  (where  we  want  the 
first  n terms  in  the  result).  For  example,  if 

°o  ^ 

P(x)  = £ a.x1  and  an  t 0 , then  the  first  n terms  in  the 

i=0  1 u 

expansion  of  1/P(x)  may  be  found  with  ~3M(n)  operations  as 
n ■+■  ».  However,  some  operations,  e.g.  computing  exponentials, 
are  much  easier  for  power  series  than  for  multiple-precision 
numbers ! 

Evaluation  of  log(P(x)) 

If  aQ  > 0 we  may  want  to  compute  the  first  n terms  in 
the  power  series  Q(x)  = log(P(x)).  Since  Q(x)  = log(aQ)  + 
log(P(x)/an)  , there  is  no  loss  of  generality  in  assuming  that 

U oo 

a_  * 1 . Suppose  Q(x)  = ][  b.x1  . From  the  relation 

U i-0  1 

(13.1)  Q’(x)  = P'(x)/P(x)  , 

where  the  prime  denotes  formal  differentiation  with  respect  to 
x , we  have 


The  first  n terms  in  the  power  series  for  the  right  side  of 
(13.2)  may  be  evaluated  with  ~4M(n)  operations,  and  then  we 
need  only  compare  coefficients  to  find  b^,...,b  j . (Since 
aQ  = 1 , we  know  that  bg  = 0.)  Thus,  the  first  n terms  in 
log(P(x))  may  be  found  in  ~4M(n)  operations.  It  is  inter- 
esting to  compare  this  result  with  Theorem  9.1. 

Evaluation  of  exp(P(x)) 

If  R(x)  = exp(P(x))  then  R(x)  = exp(aQ)exp(P(x)  - aQ)  , 
so  there  is  no  loss  of  generality  in  assuming  that  = 0 . 
Now  log(R(x))  - P(x)  = 0 , and  we  may  regard  this  as  an 
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equation  for  the  unknown  power  series  R(x)  , and  solve  it  by 
one  of  the  usual  iterative  methods.  For  example,  Newton’s 
method  gives  the  iteration 

(13.3)  Ri+1(x)  = Ri(x)  - R.(x)(log(R. (x))  - P(x))  . 

If  we  use  the  starting  approximation  RQ(x)  = 1 , then  the 

terms  in  R,  (x)  agree  exactly  with  those  in  R(x)  up  to  (but 

excluding)  the  term  0(x^  ) . Thus,  using  (13.3),  we  can  find 

the  first  n terms  of  exp(P(x))  in  ~9M(n)  operations,  and 
. . . 22 

it  is  possible  to  reduce  this  to  — M(n)  operations  by  using 
a fourth-order  method  instead  of  (13.3).  Compare  Theorem  10.1. 

Evaluation  of  Pm 

Suppose  we  want  to  evaluate  (P(x))m  for  some  large 
positive  integer  m . We  can  assume  that  aQ  ^ 0 , for  other- 
wise some  power  of  x may  be  factored  out.  Also,  since 
Pm  = aQ(P/a0)ra  , we  can  assume  that  aQ  = 1 . By  forming  , 

4 Q 

P and  then  the  appropriate  product  given  by  the 

binary  expansion  of  m , we  can  find  the  first  n terms  of 
P®  in  0(M(n)log2m)  operations.  Surprisingly,  this  is  not 
the  best  possible  result,  at  least  for  large  m . From  the 
identity 

(13.4)  P®  = exp (m. log (P)) 

and  the  above  results,  we  can  find  the  first  n terms  of  Pm 

in  0(M(n))  operations!  (If  a^  j 1 , we  also  need  0(log2m) 

operations  to  evaluate  a®  .)  If  the  methods  described  above 

are  used  to  compute  the  exponential  and  logarithm  in  (13.4), 
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then  the  number  of  operations  is  '•-j-  M(n)  as  n -*■  ». 

Other  Operations  on  Power  Series 

The  method  used  to  evaluate  log(P(x))  can  easily  be 
generalized  to  give  a method  for  f(P(x))  , where  df(t)/dt 


m 


is  a function  of  t which  may  be  written  in  terms  of  square 

roots,  reciprocals  etc.  For  example,  with  f (t)  = artan(t) 

2 

we  have  df/dt  * 1/(1  ♦ t ) , so  it  is  easy  to  evaluate 
artan(P(x)).  Using  Newton's  method  we  can  evaluate  the 
inverse  function  f^”^(P(x))  if  f(P(x))  can  be  evaluated. 
Generalizations  and  applications  are  given  in  Brent  and  Kung 
[75]. 

Some  operations  on  formal  power  series  do  not  correspond 
to  natural  operations  on  multiple-precision  numbers.  One 
example,  already  mentioned  above,  is  formal  differentiation. 
Other  interesting  examples  are  composition  and  reversion.  The 
classical  composition  and  reversion  algorithms,  as  given  in 

3 

Knuth  [69],  are  0(n  ) , but  much  faster  algorithms  exist:  see 
Brent  and  Kung  [75] . 
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